Amendment to the Claims:

This listing of claims replaces all prior versions, and
listings, of claims in the application:



1. (Currently amended) A method comprising:

receiving audio data having a beat;

forming beat data based on said audio data;

determining a gesture window within which a gesture should
occur, based on a specified time window relative to said beat
data;

playing said audio data and obtaining video data during a
time that said audio data is being played;

segmenting said video data to create a video clip based on
timing data that indicates a specified timing within a gesture
will occur of time including specified timing window; and

automatically determining information related to a gesture
occurring in the video clip only within the specified timing
window.



2. (Currently amended) The method of claim 1, wherein said
determining includes determining a probability that each of a
plurality of predefined gestures which are performed in the
video clip contains the predefined gesture within the timing
window.

3. (Original) The method of claim 2, wherein determining
the probability that the video clip contains each of the
predefined gestures includes evaluations of Hidden Markov Models.

4-6. (Canceled)

7. (Original) The method of claim 1, further comprising
displaying a target gesture to be performed by the subject of
the video data.

8. (Original) The method of claim 1, wherein each video
clip contains video frames.

9. (Currently amended) The method of claim [[1]]
further comprising identifying moving regions in each video 
frame in the video clip. 

10. (Original) The method of claim 9, further comprising 
generating a feature vector for each video frame of the video 
clip. 

11. (Currently Amended) The method of claim 1, further
comprising generating a score based on whether the video clip
contains the a target gesture.



12. (Original) The method of claim 11, further comprising
displaying the score.

13. (Currently amended) The method of claim [[1]] 11,
wherein determining if the video clip contains the predefined a
target gesture includes generating a gesture probability vector
having a plurality of elements, each element being associated
with one of a plurality of predefined gestures and representing
a probability that the video clip contains each of the
associated predefined gestures.

14. (Currently amended) A system comprising:
an audio part, receiving audio data having a beat and
forming beat data based on said audio data;

a processor, determining a gesture window within which a
gesture should occur, based on a specified time window relative
to said beat data;

a temporal segmentor connected to receive video data during
a time that said audio signal is being produced and to create a







Attorney Docket No. 10559-195001
Serial No. 09/662,679
Amendment dated October 30, 2003
Reply to Office Action dated July 30, 2003

video clip from the video data based on timing data that
indicates a specified timing within which a gesture will occur
of time including said specified time window; and

a recognition engine, in communication with the temporal
segmentor, to determine if the video clip contains a predefined
gesture, only within the specified timing window.



15. (Original) The system of claim 14, wherein the
recognition engine includes a plurality of Hidden Markov Models.

16. (Currently amended) The system of claim 14, further
comprising:

a timing data source, in communication with the temporal
segmentor, to provide the timing data to the temporal segmentor,
and

a video source, in communication with the temporal
segmentor, to provide the video data to the temporal segmentor.

17. (Original) The system of claim 14, further comprising a
move subsystem, in communication with the timing data source, to 
provide a target gesture to be performed by the subject of the 
video data. 

18. (Original) The system of claim 11, wherein the target
gesture is a dance move that is to be performed by the subject 
of the video data. 

19. (Original) The system of claim 17, further comprising a
scoring subsystem, in communication with the recognition engine 
and the move subsystem, to determine if the video clip contains 
the target gesture. 



20. (Original) The system of claim 19, further comprising a 
display subsystem, in communication with the scoring subsystem, 
to display a score that is a function of whether the video clip 
contains the target gesture. 

21. (Original) The system of claim 20, wherein the display 
subsystem is in communication with the move subsystem and is
configured to display a gesture request based on the target
gesture.

22. (Original) The system of claim 14, wherein the 
recognition engine is configured to recognize predefined 
gestures and to produce a gesture probability vector having 
elements, each element being associated with one of the
predefined gestures and representing the probability that the 
video clip contains the associated predefined gesture. 

23-25. (Canceled)

26. (Currently amended) A computer program product,
tangibly stored on a computer-readable medium, for recognizing
gestures contained in video data, comprising instructions
operable to cause a programmable processor to:
receive audio data having a beat;
form beat data based on said audio data;
determine a gesture window within which a gesture should
occur, based on a specified time window relative to said beat
data;

obtain video data during a time that said audio signal is
being produced;

segment the said video data to create a video clip based on
timing data that indicates a specified timing within which a
gesture will occur of the time including said specified timing
window; and

automatically determine if the video clip contains a
predefined gesture within the specified timing window.

27. (Canceled)




28. (Currently amended) An audio-visual processing system
including:

a video source to provide video data;

an audio source to provide audio data having a beat;

a speaker to play at least a portion of the audio data; and

a computer program product, tangibly stored on a computer-
readable medium, for recognizing gestures contained in video
data, comprising instructions operable to cause a programmable
processor, in communication with the video source and the audio
source, to:

extract beat data from the audio data;

determine a gesture window within which a gesture
should occur, based on a specified time window relative to said
beat data;

obtain video data during a time that said audio signal is
being produced;

segment the said video data to create a video clip based on
said beat data; and

automatically determine if the video clip contains a
predefined gesture within only within a specified timing window
related to said beat data.



29. (Currently amended) The video processing system of
claim 28, wherein the computer program product further includes 
instructions operable to cause the programmable processor to: 

perform a Hidden Markov Model process to determine if the 
video clip contains the predefined gesture. 

30. (Currently amended) The video processing system of 
claim 28, further comprising a display to display information 
based on whether the video clip contains the predefined gesture. 